A Perceptual Expressivity Modeling Technique for Speech Synthesis Based on Multiple-Regression HSMM
Abstract
This paper describes a technique for modeling and controlling the emotional expressivity of speech in HMM-based speech synthesis. A problem with conventional HMM-based emotional speech synthesis is that the intensity of an emotional expression in the synthetic speech depends entirely on the database used for model training. To take into account the emotional expressivity that listeners actually perceive, perceptual expressivity scores are introduced into a style control technique based on the multiple-regression hidden semi-Markov model (MRHSMM). Objective and subjective evaluation results show that the proposed technique works well when there is a large bias of emotional expressivity in the training data.
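As a rough illustration of the multiple-regression idea behind MRHSMM-based style control, the sketch below computes a state mean vector as a linear function of a style (expressivity) vector. It is a minimal toy under stated assumptions, not the authors' implementation: the function name `regress_mean`, the matrix `H`, and all numeric values are hypothetical, and a real system would estimate the regression matrices from training data.

```python
def regress_mean(H, style):
    """Return the mean vector H @ [1, style...] for one state.

    H is a hypothetical regression matrix (one row per feature dimension);
    style is the control vector, e.g. a perceived expressivity score.
    """
    xi = [1.0] + list(style)  # augmented control vector with a bias term
    if any(len(row) != len(xi) for row in H):
        raise ValueError("each row of H needs 1 + len(style) columns")
    return [sum(h * x for h, x in zip(row, xi)) for row in H]


# Toy example: a 2-dimensional mean driven by one style dimension.
# These numbers are made up for illustration only.
H = [[5.0, 0.5],
     [1.0, -0.25]]

neutral = regress_mean(H, [0.0])  # no expressivity
mild = regress_mean(H, [0.5])     # e.g. a listener-rated score of 0.5
strong = regress_mean(H, [1.0])   # full expressivity
```

In this reading, replacing a fixed per-database style value with a per-utterance perceptual score changes only the `style` input used during training and synthesis. The regression form itself stays the same.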
Related resources
A style control technique for singing voice synthesis based on multiple-regression HSMM
This paper proposes a technique for controlling singing style in the HMM-based singing voice synthesis. A style control technique based on multiple regression HSMM (MRHSMM), which was originally proposed for the HMM-based expressive speech synthesis, is applied to the conventional technique. The idea of pitch adaptive training is introduced into the MRHSMM to improve the modeling accuracy of fu...
Performance evaluation of style adaptation for hidden semi-Markov model based speech synthesis
This paper describes a style adaptation technique using hidden semi-Markov model (HSMM) based maximum likelihood linear regression (MLLR). The HSMM-based MLLR technique uses the EM algorithm to estimate regression matrices for affine transforms of the mean vectors of the output and state duration distributions, maximizing the likelihood of the adaptation data. In this study, we apply this adaptation technique ...
A technique for controlling voice quality of synthetic speech using multiple regression HSMM
This paper describes a technique for controlling voice quality of synthetic speech using multiple regression hidden semi-Markov model (HSMM). In the technique, we assume that the mean vectors of output and state duration distribution of HSMM are modeled by multiple regression with a parameter vector called voice quality control vector. We first choose three features for controlling voice qualit...
A style control technique for speech synthesis using multiple regression HSMM
This paper presents a technique for intuitively controlling the degree or intensity of speaking styles and emotional expressions in synthetic speech. The conventional style control technique based on multiple regression HMM (MRHMM) makes it difficult to control the phone duration of synthetic speech, because an HMM has no explicit parameter that models phone duration appropriately. To ...
MLLR adaptation for hidden semi-Markov model based speech synthesis
This paper describes an extension of maximum likelihood linear regression (MLLR) to hidden semi-Markov model (HSMM) and presents an adaptation technique of phoneme/state duration for an HMM-based speech synthesis system using HSMMs. The HSMM-based MLLR technique can realize the simultaneous adaptation of output distributions and state duration distributions. We focus on describing mathematical ...